1.  INTRODUCTION


A Hidden Markov Model (HMM) can be considered a state machine in which
state transitions and state outputs, or observations, are probabilistic. HMM’s are used to
learn and classify sequences of observables. HMM technology has been used
successfully in a diverse set of applications, such as speech recognition [Da, Pi], gene
prediction [Ra], and cryptanalysis [Si].

Because of the probabilistic nature of the underlying process being observed by
HMM’s, they are not often used to recognize long-periodic sequences. Rather, they are
mostly used as discriminators, to determine whether one HMM is a better fit than another. For
example, an HMM-based speech recognition system may have each HMM represent a
word, with run time voice recognition choosing the HMM that best fits the incoming
sequence of speech features. This is in contrast with Deterministic Finite Automata
(DFA) [HWU], Finite State Machines (FSM’s) [KJ], or Harel-Statecharts [Ha, D1, D2],
which are often used to identify and classify individual sequences. Stated differently,
because HMM’s identify individual sequences of external observables with a relatively
low probability, such an identification is usually not perceived as convincing evidence of
the occurrence of a particular sequence.

Run-time Verification (RV) of formal specification assertions is a class of
methods for monitoring the sequencing and temporal behavior of an underlying
application and comparing it to the correct behavior as specified by a formal
specification pattern. Some published RV tools and techniques are: the TemporalRover
and DBRover [D3], PaX [HR] and RT-Mac [SLS], all of which use extensions and
variants of Propositional Linear-time Temporal Logic (PLTL) as the specification
language of choice, and the StateRover [SR], which uses deterministic and
non-deterministic statechart diagrams as its specification language. In [D2], Drusinsky
describes the application of RV using statechart assertions to the verification of DoD and
NASA applications, and to those of the Brazilian Space Agency.

In  this  paper,  we  use  HMM’s  to  identify  hidden  events  and  sequences  thereof. 
However,  we  will  not  be  using  the  (rather  small)  probability  of  an  observable  sequence, 
but  rather  the  probability  of  a  hidden  state  being  reached  given  a  sequence  of 
observables.  Hence,  the  technique  identifies  hidden  events  with  a  relatively  high 
probability. 

This  paper  describes  a  pattern  detection  technique  suitable  for  financial  systems 
in  which  not  all  artifacts  are  necessarily  observable.  The  technique  is  a  novel 
combination  of  Hidden  Markov  Models  (HMM’s)  with  RV  techniques  for  probabilistic 
pattern  matching  of  statechart  patterns.  Throughout  the  paper,  we  will  be  using  the 
Statechart assertion formal specification language of [D1, D2]. We will show a
probabilistic  variant  of  this  formalism  suitable  for  pattern  detection  within  systems  with 
hidden  inputs. 

The  technique  in  this  paper  is  not  positioned  as  a  method  for  achieving  financial 
gains  in  financial  markets.  Various  papers  investigating  and  analyzing  such  statistical 
techniques  can  be  found  in  the  literature,  using  artifacts  such  as  long-term  memory 
[MBH], self similarity [GMP], and fat-tailed distributions [SCLC]; in fact, power laws
are used to classify time series sequences in many other fields besides financial systems
[L, TV, LC, LZ].

Rather, the technique suggested in this paper is positioned as a hybrid pattern
detection technique that combines patterns written by humans with statistical
observations, manifested as HMM’s. In other words, it is positioned as a hybrid
between formal specification and run-time verification techniques (e.g., [D1, D2, DMS])
and statistical pattern detection. Section 8 addresses the possibility of extending our
approach to utilize some of the above-mentioned statistical techniques, such as long-term
memory, fat-tailed distributions, and various other fractal properties.

The rest of the paper is organized as follows. Section 2 provides an overview of
behavioral pattern detection using deterministic UML statechart patterns. Section 3
provides an overview of HMM’s and HMM related algorithms. Section 4 describes our
proposed pattern detection architecture and process that uses a combination of hidden
and visible data, using an HMM connected to a behavioral pattern detection monitor.
Section 5 describes HMM parameter estimation for the financial data HMM component.
Section 6 describes the operation of the probabilistic pattern-matching monitor, and
section 7 describes three techniques for computing the probability distribution used by
that monitor. Section 8 concludes and outlines future research.

2.  BEHAVIORAL  PATTERN  DETECTION  USING 
DETERMINISTIC  UML  STATECHART  PATTERNS  -  AN 
OVERVIEW 

Consider  the  following  natural  language  (NL)  patterns  for  a  credit  card  (CC) 
system;  the  NL  pattern  is  specified  as  being  flagged  when  a  scenario  conforms  to  the 
pattern: 

R1. Flag a customer whose average expense, over three consecutive non-holiday
weekend clothing related transactions, is of a Dollar amount greater than his or her $\mu + \sigma$,
where $\mu$ and $\sigma$ are, respectively, the mean and standard deviation of the customer’s
clothing expenses during the previous year.

Figure 1 depicts a statechart-pattern for R1. As described in [D1, D2], a
statechart-pattern is a state-machine augmented with hierarchy, flowcharting capabilities, a Java
action language, and a built-in Boolean flag named bFlag whose default value is false,
with a true value indicating that the pattern has been flagged (e.g., per pattern R1, flags
that the input scenario conforms to R1).

Figure 1. A statechart-pattern for requirement R1.

The statechart-pattern of Fig. 1 combines flowchart and state-machine elements.
Rectangular boxes and decision diamonds are flowchart elements: the statechart flows
through them while executing their actions and conditions, eventually resting on a state
machine state like WaitForTransaction, where the statechart waits for an event. Hence,
the statechart flows through the Init flowchart box, executes its actions and waits in the
WaitForTransaction state. When a user transaction occurs (newTransaction event, with
two arguments, a Transaction object and a User object) the statechart checks whether the
argument is a non-holiday, clothing related transaction. If not, then the statechart waits for
the next transaction. If it is, then the statechart checks whether the transaction is a
weekend transaction. If it is, the statechart calculates the average amount spent on the
most recent three such transactions this weekend (see the CalculateAverage flowchart
box). If this average exceeds the user’s $\mu + \sigma$ then the pattern detection flag is raised
(bFlag = true).
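The behavior described above can be approximated by a small monitor. The following Python sketch is only an illustration under stated assumptions, not the StateRover-generated Java class: the class name R1Monitor, the method new_transaction, and its Boolean arguments are hypothetical, the previous year's clothing expenses are assumed to be available up front, the population standard deviation is assumed, and "consecutive" is read as consecutive among qualifying transactions.

```python
from collections import deque
from statistics import mean, pstdev


class R1Monitor:
    """Hypothetical, simplified stand-in for the statechart-pattern of Fig. 1."""

    def __init__(self, last_year_clothing_expenses):
        # Threshold mu + sigma over the customer's clothing expenses of the previous year.
        self.threshold = (mean(last_year_clothing_expenses)
                          + pstdev(last_year_clothing_expenses))
        self.recent = deque(maxlen=3)  # three most recent qualifying amounts
        self.b_flag = False            # mirrors bFlag; default value is false

    def new_transaction(self, amount, is_clothing, is_weekend, is_holiday):
        # Only non-holiday, weekend, clothing related transactions are considered;
        # for anything else the monitor simply keeps waiting.
        if not (is_clothing and is_weekend and not is_holiday):
            return self.b_flag
        self.recent.append(amount)
        if len(self.recent) == 3 and mean(self.recent) > self.threshold:
            self.b_flag = True
        return self.b_flag


# Made-up data: last year's expenses and three qualifying weekend purchases.
monitor = R1Monitor([40.0, 60.0, 50.0, 50.0])
for amt in (70.0, 80.0, 90.0):
    monitor.new_transaction(amt, is_clothing=True, is_weekend=True, is_holiday=False)
```

One such monitor instance would exist per user, in the same way that one instance of the generated pattern class is assumed per user.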

Pattern matching is performed by comparing a trace of the financial system (e.g.,
a CC statement or bank log) to the behavior of the pattern set. The StateRover tool does
so using a two-step process. First, a transaction log, or statement, is converted into an
equivalent JUnit test [JU], and the pattern is code-generated into an equivalent Java class
(details about this code generator are available in [D1]). Next comes an RV step where
the JUnit test is executed, thus checking that the transaction log conforms to the pattern¹.

The  extended  pattern  matching  technique  suggested  in  this  paper  uses  the  same 
process  for  the  development  of  patterns,  i.e.,  patterns  are  developed  as  deterministic 
patterns. However, rather than performing deterministic RV by virtue of using a code
generator  that  generates  a  deterministic  pattern  implementation,  our  technique  performs 
probabilistic  pattern  detection  using  a  special  pattern  code  generator  that  generates  a 
probabilistic,  weighted  implementation.  Specific  details  are  provided  in  section  6. 

¹ Note that we assume that an instance of the R1 pattern, i.e., an instance
object of the Java class generated for the statechart-pattern of Fig. 1, exists per user.

3.  HIDDEN  MARKOV  MODELS

A (discrete) hidden Markov model (HMM) is a statistical Markov model in which
the system being modeled is assumed to be a Markov process with unobserved, or hidden,
states. While in a regular Markov model the state is directly visible to the observer, in a
hidden Markov model the state is not directly visible, while the output, dependent on the
state, is visible.

The  parameters  of  a  simple  HMM  are  [Ra]: 

• $N$, the number of states in the model. Individual states are denoted $S = \{s_1, s_2, \ldots, s_N\}$,
and the state at time $t$ as $q_t$.

• $M$, the number of distinct observation symbols. Individual symbols are denoted
$V = \{v_1, v_2, \ldots, v_M\}$.

• The state transition probability distribution $A = \{a_{ij}\}$, where
$a_{ij} = P[q_{t+1} = s_j \mid q_t = s_i]$, $1 \le i, j \le N$. Clearly, $\forall i$, $1 \le i \le N$: $\sum_{1 \le j \le N} a_{ij} = 1$.

• The observation symbol probability distribution in state $j$, $B = \{b_j(k)\}$, where
$b_j(k) = P[v_k \text{ at } t \mid q_t = s_j]$, $1 \le j \le N$, $1 \le k \le M$.

• The initial state distribution $\pi = \{\pi_i\}$, where $\pi_i = P[q_1 = s_i]$, $1 \le i \le N$.

Rabiner [Ra] describes the following three primary problems associated with
HMM’s:

1. Given the observation sequence $O = O_1 O_2 \ldots O_T$ and an HMM model $\lambda = (A, B, \pi)$,
how do we efficiently compute $P(O \mid \lambda)$?

2. Given the observation sequence $O = O_1 O_2 \ldots O_T$ and an HMM model $\lambda = (A, B, \pi)$,
how do we choose an optimal state sequence $Q = q_1 q_2 \ldots q_T$?

3. How do we calculate the model parameters $\lambda = (A, B, \pi)$ to maximize $P(O \mid \lambda)$?

The  most  well  known  algorithms  used  to  solve  these  problems  are: 

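1. The forward algorithm, for calculating the forward variable
$\alpha_t(i) = P(O_1 O_2 \ldots O_t, q_t = s_i \mid \lambda)$. The forward algorithm is a dynamic
programming algorithm based on the recurrence:

$\alpha_{t+1}(j) = \left[ \sum_{i=1}^{N} \alpha_t(i)\, a_{ij} \right] b_j(O_{t+1})$, $1 \le t \le T-1$, $1 \le j \le N$,
with the initialization:
$\alpha_1(i) = \pi_i\, b_i(O_1)$.

Note that $P(O_1 O_2 \ldots O_T \mid \lambda) = \sum_{i=1}^{N} \alpha_T(i)$.
$\alpha'$ is the normalized version of $\alpha$:

$\alpha'_t(i) = P(q_t = s_i \mid O_1 O_2 \ldots O_t, \lambda)$, calculated recursively as:

$\alpha'_{t+1}(i) = \alpha_{t+1}(i) / P(O_1 O_2 \ldots O_{t+1} \mid \lambda)$.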

2. The backward algorithm, for calculating the backward variable
$\beta_t(i) = P(O_{t+1} O_{t+2} \ldots O_T \mid q_t = s_i, \lambda)$. The algorithm is a dynamic
programming algorithm based on the recurrence:

$\beta_t(i) = \sum_{j=1}^{N} a_{ij}\, b_j(O_{t+1})\, \beta_{t+1}(j)$, for
$t = T-1, T-2, \ldots, 1$, and $1 \le i \le N$,
with the initialization:

$\beta_T(i) = 1$, for $1 \le i \le N$.

3. The forward-backward algorithm, for calculating the forward-backward variable

$\gamma_t(i) = P(q_t = s_i \mid O_1 \ldots O_T, \lambda)$.

$\gamma$ is also:

$\gamma_t(i) = \alpha_t(i)\, \beta_t(i) / P(O_1 O_2 \ldots O_T \mid \lambda)$.

4. The Viterbi algorithm, for calculating the best state sequence that explains an
observation sequence, $\delta_T(O_1 O_2 \ldots O_T \mid \lambda)$. The algorithm defines:

$\delta_t(i) = \max_{q_1, q_2, \ldots, q_{t-1}} P(q_1, q_2, \ldots, q_t = s_i, O_1 O_2 \ldots O_t \mid \lambda)$,

and uses the following recursive formula:

$\delta_t(j) = \max_{1 \le i \le N} \left[ \delta_{t-1}(i)\, a_{ij} \right] b_j(O_t)$

along with the following formula, used to recover the actual most probable
state sequence:

$\psi_t(j) = \arg\max_{1 \le i \le N} \left[ \delta_{t-1}(i)\, a_{ij} \right]$, where $\psi_1(i) = 0$.

The Viterbi algorithm is essentially the forward algorithm with a recurrence in
which a max operator is used instead of the sum. The probability of the best state
sequence, $\delta_T(O_1 O_2 \ldots O_T \mid \lambda)$, is then the maximal $\delta_T(i)$, $1 \le i \le N$, and
$q_T = \arg\max_i \delta_T(i)$, $1 \le i \le N$.

The most probable state sequence $q_1, q_2, \ldots, q_T$ is calculated in a backward
manner, using $q_{t-1} = \psi_t(q_t)$.
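To make the recurrences above concrete, the following Python sketch computes the forward variable $\alpha$, its normalized version $\alpha'$, the backward variable $\beta$, and $\gamma$ for a toy model. It is a minimal illustration under stated assumptions: the two-state, two-symbol model and the observation sequence are made up, the function names are hypothetical, and no rescaling against numerical underflow is attempted.

```python
def forward(A, B, pi, obs):
    """Unnormalized forward variables alpha_t(i); obs holds observation-symbol indices."""
    N = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(N)]]           # initialization
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([sum(prev[i] * A[i][j] for i in range(N)) * B[j][o]
                      for j in range(N)])                         # recurrence
    return alpha


def backward(A, B, obs):
    """Backward variables beta_t(i), with beta_T(i) = 1."""
    N = len(A)
    beta = [[1.0] * N]
    for o in reversed(obs[1:]):                                   # o plays the role of O_{t+1}
        nxt = beta[0]
        beta.insert(0, [sum(A[i][j] * B[j][o] * nxt[j] for j in range(N))
                        for i in range(N)])
    return beta


def normalized_alpha(alpha):
    """alpha'_t(i) = alpha_t(i) / P(O_1..O_t), where P(O_1..O_t) = sum_i alpha_t(i)."""
    return [[a / sum(row) for a in row] for row in alpha]


def gamma(alpha, beta):
    """gamma_t(i) = alpha_t(i) * beta_t(i) / P(O_1..O_T)."""
    p_obs = sum(alpha[-1])
    return [[a * b / p_obs for a, b in zip(ra, rb)] for ra, rb in zip(alpha, beta)]


# Toy model (assumed values, not taken from the paper): B[state][symbol].
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
pi = [0.5, 0.5]
obs = [0, 1, 0]

alpha = forward(A, B, pi, obs)
beta = backward(A, B, obs)
alpha_norm = normalized_alpha(alpha)
gam = gamma(alpha, beta)
```

At the final time step $\alpha'_T$ and $\gamma_T$ coincide, because $\beta_T(i) = 1$; at earlier steps $\gamma$ additionally takes the later observations into account.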

4.  PATTERNS  WITH  HIDDEN  DATA 

Suppose  our  financial  system  would  like  to  detect  patterns  that  assert  about  data 
that  is  not  explicitly  present  in  the  list  of  transactions,  such  as  whether  a  transaction  is 
business  related  (albeit  using  a  personal  CC  or  bank  account),  is  personal,  is  suspect  (as 
fraudulent),  or  is  investment  related.  More  specifically,  consider  the  NL  for  pattern  R2. 

R2. Flag a customer whose investment transaction Dollar amount, over a period of a
week with at least two investment transactions, is 20% higher than the
customer’s personal transaction Dollar amount.

Figure  2  depicts  a  statechart-pattern  for  requirement  R2.  Note  that  it  asserts  about 
visible  information  (e.g.,  newTransaction  event  and  transaction  amount  data  item)  as  well 
as hidden information (hmmType, being PERSONAL or INVESTMENT).


Figure  2.  A  statechart-pattern  for  requirement  R2. 

To  enable  pattern  detection  of  a  transaction  log  with  respect  to  R2  and  its 
corresponding  statechart-pattern,  we  apply  the  pattern  detection  architecture  of  Fig.  3. 
Key components of the architecture are:

1. It contains a Hidden Markov Model (HMM), used to decode the probability of
occurrence  of  sequences  of  hidden  states  given  sequences  of  the  observable 
transactions.  The  states  of  the  HMM  are:  Business,  Personal,  Suspect,  and 
Investment,  referring  to  the  four  transaction  types  discussed  earlier.  This  HMM 
provides  a  plurality  of  weighted  transaction  type  inputs  to  the  statechart-pattern, 
weights  reflecting  the  probability  the  corresponding  HMM  state  was  reached  given 
the  observed  transaction  sequence. 

2.  It  uses  a  special  code  generator  that  generates  a  probabilistic  implementation  for  the 
statechart-pattern,  one  that  operates  on  the  weighted  inputs  from  the  HMM. 

3.  It  evaluates  the  pattern  using  a  success  score  in  the  range  [0,1]. 


[Figure 3 diagram. Recoverable labels: Transaction; HMM; HMM state; Pattern (e.g., Fig. 2)
with weighted implementation (described in section 6); Pattern matching score.]

Figure 3. The pattern matching architecture for the transaction log and
requirement R2.

The HMM parameters for this example are determined in the learning-phase
discussed in section 5. They are:

• The state set Q consists of the four states mentioned above: Business, Personal,
Suspect, and Investment, denoted as states 0, 1, 2, and 3, respectively.

• An observable O, which is a triplet describing the data combinations required by the
HMM to determine the next state. For example, the Boolean conjunction
isHoliday ∧ isAutomotive ∧ !isYouthExpense means the (visible) transaction
occurred during a holiday, is automotive related, and is not related to an expense a
young person typically makes. We represent each observable as a tuple of integers
such as <2,6,0>, where each integer component represents a condition.

Section 5 describes in greater detail the set of observable triplets for this
example and their associated learning process.

•  State  transition  probabilities,  given  by  the  matrix  A  in  Table  1 . 


Source \ Target   Buss.   Pers.   Susp.   Inv.
Buss.             0.35    0.46    0.05    0.14
Pers.             0.25    0.60    0.06    0.09
Suspect           0.37    0.39    0.23    0.01
Inv.              0.35    0.38    0.05    0.22

Table 1. Matrix A of HMM state transition probabilities

• Matrix B, containing $b_s(O)$, the probability of an observable O being observed in
state s, part of which is presented in Table 2.


O \ state    Buss.   Pers.   Suspect   Inv.
<0,1,0>      0.01    0.11    0.01      0.03
<0,2,1>      0.03    0.15    0.01      0.0001
<1,0,0>      0.01    0.22    0.03      0.01

Table 2. A part of Matrix B, of probability of observation O in HMM state s.

•  The  initial  state  distribution  is  [0.2,  0.65,  0.05,  0.1]  for  Business,  Personal,  Suspect, 
and  Investment,  respectively. 

Pattern-detection  now  proceeds  according  to  the  process  illustrated  in  Fig.  3,  as 
follows. 

Transactions  from  the  transaction-log  are  fed  into  the  HMM,  which  then  executes 
a  probability  estimation  algorithm,  such  as  the  forward-algorithm,  for  the  current  iteration 
(section  7  discusses  three  probability  estimation  techniques).  These  probability  values 
represent  probabilities  of  the  HMM  being  in  states  Business,  Personal,  Suspect,  or 
Investment.  This  vector  of  symbols  and  corresponding  probabilities  is  passed  to  the 
pattern’s  implementation  code,  which  executes  a  weighted  version  of  a  state-machine 
state  change,  detailed  in  section  6.  Finally,  as  discussed  in  section  6,  the  pattern  detector 
announces  the  probability  it  flagged  a  pattern  match. 
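The first step of this process can be sketched with the parameters given above. The Python fragment below applies the normalized forward recurrence to Table 1, the rows of Table 2, and the initial state distribution; the two-transaction observation sequence and the function name pod_sequence are hypothetical, and only the three observables listed in Table 2 are supported because the rest of matrix B is not given.

```python
STATES = ["Business", "Personal", "Suspect", "Investment"]

# Table 1: A[source][target].
A = [[0.35, 0.46, 0.05, 0.14],
     [0.25, 0.60, 0.06, 0.09],
     [0.37, 0.39, 0.23, 0.01],
     [0.35, 0.38, 0.05, 0.22]]

# Initial state distribution for Business, Personal, Suspect, Investment.
PI = [0.2, 0.65, 0.05, 0.1]

# Table 2 (partial matrix B): observable triplet -> probability per state.
B = {(0, 1, 0): [0.01, 0.11, 0.01, 0.03],
     (0, 2, 1): [0.03, 0.15, 0.01, 0.0001],
     (1, 0, 0): [0.01, 0.22, 0.03, 0.01]}


def pod_sequence(observations):
    """Per-transaction state probabilities given the observations seen so far."""
    pods, prev = [], None
    for obs in observations:
        b = B[obs]
        if prev is None:
            unnorm = [PI[i] * b[i] for i in range(4)]
        else:
            unnorm = [sum(prev[i] * A[i][j] for i in range(4)) * b[j]
                      for j in range(4)]
        total = sum(unnorm)
        prev = [u / total for u in unnorm]   # normalize so the vector sums to 1
        pods.append(dict(zip(STATES, prev)))
    return pods


# Hypothetical two-transaction log, encoded as observable triplets.
pods = pod_sequence([(0, 1, 0), (1, 0, 0)])
```

Each entry of pods is the kind of weighted vector of symbols and probabilities that is handed to the pattern's implementation code for one transaction.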

5.  FROM  PATTERNS  TO  HMM  PARAMETER  ESTIMATION 

HMM  parameter  estimation,  i.e.,  estimating  the  transition  probability  and 
probability  of  state  observations,  is  considered  a  difficult  problem.  In  particular,  it  is 
difficult  to  estimate  the  number  of  HMM  states,  the  extreme  cases  being  using  one  state 
(i.e.,  reducing  the  HMM  to  a  stationary  process)  or  n  states,  n  being  the  length  of  the 
observation  sequence. 

In  our  case  however,  HMM  states  are  known;  they  are  directly  related  to  the 
hidden  pattern  artifacts.  In  our  example,  the  four  hidden  symbols  Business,  Personal, 
Suspect,  and  Investment,  are  derived  from  Fig.  2  and  its  pattern  specification  R2,  as  well 
as  from  other  patterns. 

Our  use-case  for  HMM’s  induces  a  simple  method  for  calculating  transition  and 
observable  probabilities.  Because  HMM  states  relate  to  real  world  artifacts,  we  can 
conduct  learning-phase  experiments,  which  measure  relative  frequencies  using  standard 
frequency  analysis.  The  financial  industry  performs  such  experiments  as  a  matter  of 
business [Se]. Table 3 illustrates the learning-phase process with a list of transactions
taken from the author’s CC statement; the author annotated the transactions with the
corresponding HMM state, listed in the right-most column.

#    Date       Merchant                   Amt      State
1    9/22/11    WHOLEFDS                   57.25    P
2    9/22/11    TRADER JOE’S               113.35   P
3    9/23/11    IKEA                       975.86   P
4    9/23/11    IKEA                       68       B
5    9/23/11    UNION 76                   38.46    P
6    9/25/11    M. LI DDS                  38.8     P
7    9/30/11    UNION 76                   25.6     P
8    10/5/11    DOG EAR PUBLISHING         175      B
9    10/6/11    UNITED AIR                 1666.8   P
10   10/7/11    MCAFEE                     45.99    B
11   10/8/11    GREAT TANS                 49.25    S
12   10/9/11    CVS PHARMACY               30.27    P
13   10/11/11   K APT HOME OWNERS ASSOC    303      I
14   10/13/11   RADIOSHACK                 13.04    B

Table 3. A sample segment of the author’s transaction log. The rightmost column
indicates the HMM state as added by the author in the learning phase.

With this information the process continues by identifying the combination of
visible transaction-log data, in the form of Boolean conditions, that according to the
training user induces the states in the rows of Table 3. Table 4 presents this information.


#    Condition     Condition      Condition   State
1    Food                                     P
2    Food                                     P
3    Furniture     Weekend                    P
4    Furniture     !Weekend       !Holiday    B
5    Automotive                               P
6    Health                                   P
7    Automotive                               P
8    Publishing                               B
9    Travel        Amt > 1000                 P
10   ITSecurity    Amt > 40                   B
11   Leisure       YouthExpense               S
12   Health                                   P
13   Residential   Non Local                  I
14   Electronics   Weekday                    B

Table 4. Conditions that induce the HMM states in Table 3.

The conditions of Table 4 represent data items required by the HMM to determine
the next state. In our example these items are: isWeekend, isHoliday, isFurniture,
isPublishing, isResidential, isElectronics, isHealth, isAutomotive, isLeisure,
isYouthExpense, isITSecurity, and Amt (Dollar amount). All data items with the prefix is
are Boolean conditions. According to Table 4, the Amt data can be divided into 3 mutually
exclusive segments: less than $40, between $40 and $1000, and above $1000, denoted as
Amt_{<40}, Amt_{[40,1000]}, and Amt_{>1000}, respectively.

Note that the number of combinations of these data items is large: 3 × 2^11 = 6144.
However, most conditions are not necessarily orthogonal, but are often mutually
exclusive. We identify three bins of mutually exclusive conditions:

1. Temporal - containing isWeekend and isHoliday (when a certain day is both, we say
it is isHoliday).

2.  Type  of  purchased  object  -  containing  isFurniture,  isPublishing,  isResidential, 
isElectronics,  isHealth,  isAutomotive,  isLeisure,  and  isITSecurity. 

3.  Age  group  for  purchased  object  -  containing  isYouthExpense. 

Using these three bins we encode observables as triplets, such as <2,6,0> being:
isHoliday ∧ isAutomotive ∧ !isYouthExpense.
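One possible reading of this encoding is sketched below in Python. The integer codes are an assumption: the text fixes only that <2,6,0> stands for isHoliday ∧ isAutomotive ∧ !isYouthExpense, which is consistent with numbering the conditions of each bin from 1 in the order listed and reserving 0 for "no condition of this bin holds". The function name encode is hypothetical.

```python
# Assumed codes: position in the list + 1; 0 means no condition of the bin holds.
TEMPORAL = ["isWeekend", "isHoliday"]
PURCHASE_TYPE = ["isFurniture", "isPublishing", "isResidential", "isElectronics",
                 "isHealth", "isAutomotive", "isLeisure", "isITSecurity"]
AGE_GROUP = ["isYouthExpense"]


def encode(conditions):
    """Map the set of Boolean conditions that hold for a transaction to a triplet."""
    def code(bin_names):
        hits = [k + 1 for k, name in enumerate(bin_names) if name in conditions]
        # If several conditions of one bin hold, the last listed one wins, so a
        # holiday that falls on a weekend is coded as isHoliday.
        return hits[-1] if hits else 0
    return (code(TEMPORAL), code(PURCHASE_TYPE), code(AGE_GROUP))


triplet = encode({"isHoliday", "isAutomotive"})   # (2, 6, 0)
```

Under this assumed coding, which like the three bins above leaves out Amt, there are at most 3 × 9 × 2 = 54 distinct triplets, far fewer than the 6144 raw combinations.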

HMM parameters follow from this information in a straightforward manner. For
example, the probability of a transition from state Personal to state Business is the ratio
of the number of transactions with a P state whose next transaction has a B state to the total
number of transactions with a P state, being 0.375 for the data in Tables 3 and 4. Similarly, the
probability of observable <2,6,0> in state Personal is the ratio of the number of transactions
with a P state and observable <2,6,0> to the total number of transactions with a P state,
being 0.25 for the data in Tables 3 and 4.
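The relative-frequency computation can be reproduced from the two tables. The Python sketch below uses the state column of Table 3 and, as a simplifying assumption, only the first condition of each row of Table 4 (the purchase category) as a stand-in for the full observable triplet; the function names are hypothetical.

```python
from collections import Counter

# Rightmost column of Table 3, rows 1..14.
states = list("PPPBPPPBPBSPIB")

# First condition of each row of Table 4, rows 1..14.
categories = ["Food", "Food", "Furniture", "Furniture", "Automotive", "Health",
              "Automotive", "Publishing", "Travel", "ITSecurity", "Leisure",
              "Health", "Residential", "Electronics"]


def transition_probability(src, dst):
    """Relative frequency of src -> dst among transactions in state src that
    have a successor (for P, every transaction in Table 3 has one)."""
    pairs = Counter(zip(states, states[1:]))
    from_src = sum(n for (s, _), n in pairs.items() if s == src)
    return pairs[(src, dst)] / from_src


def category_probability(state, category):
    """Relative frequency of a purchase category among transactions in a state."""
    in_state = [c for s, c in zip(states, categories) if s == state]
    return in_state.count(category) / len(in_state)


p_to_b = transition_probability("P", "B")               # 3 of 8 -> 0.375
p_automotive = category_probability("P", "Automotive")  # 2 of 8 -> 0.25
```

Both values agree with the figures quoted above for the data in Tables 3 and 4.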

6.  BEHAVIORAL  PATTERN  MATCHING  IN  THE  PRESENCE  OF 
HIDDEN  DATA 

Using  the  architecture  of  Fig.  3,  the  pattern-matching  module  observes  sequences 
that  consist  of  visible  as  well  as  hidden  artifacts;  in  Fig.  2  for  example,  newTransaction  is 
a  visible  event,  while  hmmType  is  hidden.  Hidden  artifacts  have  an  associated  probability 
distribution  which  we  call  the  probability-of-occurrence  distribution  (POD),  such  as 
POD-1: hmmType = BUSINESS, PERSONAL, SUSPECT, or INVESTMENT at time 5
occurs with probability 0.1, 0.8, 0.05, 0.05, respectively. Section 7 describes three
techniques, called $\alpha'$, $\gamma$, and $\delta''$, for computing the cycle-by-cycle POD, based on $\alpha$, $\gamma$,
and $\delta$, respectively. We consider a visible artifact to have a probability of occurrence of
1.

A weighted/probabilistic implementation of the statechart-pattern module of Fig.
3 responds to an input sequence $I = \langle S_1, P_1 \rangle, \langle S_2, P_2 \rangle, \ldots, \langle S_T, P_T \rangle$, where $S_t$ is a
visible or hidden artifact (i.e., an event such as newTransaction, or a data artifact, i.e., a
variable, such as hmmType, both in Fig. 2), and $P_t$ is the POD of $S_t$.

We use the UML notation for $S_t$: $S_t = event_t[condition_t]$, where $condition_t$ is
optional; $event_t$ and $condition_t$ can each be visible or hidden.

A  pattern  implementation  consists  of  a  collection  C  of  instances,  or  copies,  of  the 
pattern,  called  configurations.  Each  configuration  executes  as  a  standalone  pattern  and 
preserves  its  own  present-state.  Each  configuration  Con  has  a  probability  measure 
P(Con),  called  the  Configuration  Probability  Measure  (CPM),  that  measures  the 
probability  the  pattern  is  behaving  as  suggested  by  Con,  i.e.,  that  its  present-state  is  Con’s 
present state. Upon startup, C consists of a single configuration $Con_{default}$ whose
present-state, denoted $PS(Con_{default})$, is the pattern’s default state (e.g., state Init in Fig. 2), and
having $P(Con_{default}) = 1$.

All configurations of C respond to a pair $\langle S_t, P_t \rangle$ of $I$, as follows. If $P_t = 1$ then
the configuration performs a conventional state machine state change upon input $S_t$.
Otherwise, either $event_t$ or $condition_t$ is hidden. In this case the configuration Con is
replaced with two configurations, Con1 and Con2, whose present-state probabilities are
calculated as follows:

• If $event_t$ is hidden (as discussed in section 7) then $P(Con1) = P(Con) \cdot P_t$ and
$P(Con2) = P(Con) \cdot (1 - P_t)$. The calculation of the probability of hidden events is
described in a companion paper.

• If $condition_t$ is hidden, then we calculate $P(condition_t)$, the probability of the
condition, as a function of the probabilities of its constituent variables using standard
probability calculations. For example, if $condition_t$ is hmmState = BUSINESS ||
hmmState = INVESTMENT then $P(condition_t)$ = P(hmmState = BUSINESS) +
P(hmmState = INVESTMENT), where each term is taken from the POD at time t,
such as 0.1 and 0.05 respectively, using POD-1.

We set $P(Con1) = P(Con)\, P(condition_t)$ and $P(Con2) = P(Con)\, (1 - P(condition_t))$.

Let PS(Con) denote Con’s present-state. PS(Con1) and PS(Con2) are determined
as follows:

• If $event_t$ is hidden then PS(Con1) is the next state determined by the pattern’s
transition out of PS(Con), under the assumption that the event fired, and PS(Con2) =
PS(Con).

• If $condition_t$ is hidden (e.g., the hmmState==PERSONAL condition in Fig. 2), then
PS(Con1) is calculated assuming $condition_t$ = true and PS(Con2) is calculated
assuming $condition_t$ = false.

For the sake of simplicity we disallow patterns in which both $event_t$ and $condition_t$
are hidden.

The configurations in C are managed routinely (i.e., every cycle t) as follows. All
configurations Con' with the same present-state² are merged into a single configuration
$Con_{merged}$, using the sum of all P(Con') as $P(Con_{merged})$.

The statechart-pattern declares a probability of flag (POF), i.e., the probability its
corresponding NL requirement has been flagged, on a cycle by cycle basis, being the sum
of all P(Con) for all configurations Con such that PS(Con) is a flag state.

Note that statechart-patterns typically use sink-states as flag states, sink-states
being  states  with  no  outgoing  transitions.  For  such  patterns,  the  POF  is  monotonically 
increasing  with  time. 
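The split, merge, and POF steps can be illustrated with a small Python sketch. It is not the planned StateRover code generator: the three-state toy pattern (flag after two transactions for which the hidden condition hmmType == INVESTMENT holds), the per-cycle probabilities, and all names are hypothetical, and the present-state is reduced to a plain state name rather than an extended state vector.

```python
from collections import defaultdict

# Hypothetical toy pattern (not Fig. 2): next state when the hidden condition is true.
NEXT_IF_TRUE = {"Wait": "SeenOne", "SeenOne": "Flagged", "Flagged": "Flagged"}
FLAG_STATES = {"Flagged"}          # "Flagged" is a sink-state


def step(configs, p_condition):
    """One cycle: split every configuration on the hidden condition, then merge
    configurations that share a present-state by summing their probabilities."""
    merged = defaultdict(float)
    for state, prob in configs.items():
        merged[NEXT_IF_TRUE[state]] += prob * p_condition      # Con1: condition true
        merged[state] += prob * (1.0 - p_condition)            # Con2: condition false
    return dict(merged)


def probability_of_flag(configs):
    """POF: total probability of configurations whose present-state is a flag state."""
    return sum(p for s, p in configs.items() if s in FLAG_STATES)


configs = {"Wait": 1.0}            # the single default configuration, probability 1
history = []
for p_investment in (0.05, 0.6, 0.7):   # assumed P(hmmType == INVESTMENT) per cycle
    configs = step(configs, p_investment)
    history.append(probability_of_flag(configs))
```

Because merging keeps one entry per present-state, the number of configurations is bounded by the number of distinct present-states, and with a sink flag state the recorded POF values never decrease.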

7.  CALCULATING  THE  POD  OF  A  HIDDEN  ARTIFACT 

We  propose  three  techniques  for  estimating  the  POD  at  time  t:  the  alpha,  gamma, 
and  delta  methods,  as  follows. 

• The alpha method, which uses N values of $\alpha'_t(i) = P(q_t = s_i \mid O_1 O_2 \ldots O_t, \lambda)$, one per
symbol $s_i$, $1 \le i \le N$. Note that $\sum_{1 \le i \le N} \alpha'_t(i) = 1$.

• The gamma method, which uses N values of $\gamma_t(i) = P(q_t = s_i \mid O_1 O_2 \ldots O_T, \lambda)$, one per
symbol $s_i$, $1 \le i \le N$. Note that $\sum_{1 \le i \le N} \gamma_t(i) = 1$.

• The delta method, which uses N values of:

$\delta''_t(i) = \delta'_t(i) / \sum_{1 \le j \le N} \delta'_t(j)$, where

$\delta'_t(i) = \max_{q_1, q_2, \ldots, q_{t-1}} P(q_1, q_2, \ldots, q_t = s_i \mid O_1 O_2 \ldots O_t, \lambda)$, so that
$\delta'_t(i) = \delta_t(i) / P(O_1 O_2 \ldots O_t \mid \lambda)$. In other words, $\delta''_t(i)$ is a
normalized version of $\delta'_t(i)$, which in turn is the probability of the HMM generating
symbol $s_i$ at time t, via the most probable state sequence, given the observation.

The gamma method is a forward-backward algorithm; it therefore requires the
entire observable sequence $O_1 O_2 \ldots O_T$ for the evaluation of $\gamma_t(i)$ for $t < T$. The alpha and
delta methods, on the other hand, are forward algorithms and therefore do not require
future-time information.
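As an illustration of the forward-only character of the delta method, the Python sketch below computes $\delta''_t(i)$ cycle by cycle. It relies on the observation that the common factor $P(O_1 O_2 \ldots O_t \mid \lambda)$ cancels in the normalization, so $\delta''_t(i)$ can be obtained directly from the Viterbi scores $\delta_t(i)$. The toy model, the observation sequence, and the function name delta_pod are assumptions made for the example.

```python
def delta_pod(A, B, pi, obs):
    """Per-cycle delta'' values: Viterbi scores delta_t(i), normalized over the states."""
    N = len(pi)
    delta = [pi[i] * B[i][obs[0]] for i in range(N)]           # delta_1(i)
    total = sum(delta)
    pods = [[d / total for d in delta]]
    for o in obs[1:]:
        # Same shape as the forward recurrence, with max in place of the sum.
        delta = [max(delta[i] * A[i][j] for i in range(N)) * B[j][o]
                 for j in range(N)]
        total = sum(delta)
        pods.append([d / total for d in delta])
    return pods


# Toy model (assumed values, not taken from the paper): B[state][symbol].
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
pi = [0.5, 0.5]

pods = delta_pod(A, B, pi, [0, 1, 0])
```

Each row of pods depends only on the observations up to that cycle, so the values can be passed to the pattern monitor as the transactions arrive.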

When the HMM contains transitions with probability 0, then all three methods
might induce sequences of symbols that cannot be physically generated. For example,
consider an HMM with $N = 3$ and $a_{1,2} = 0$, and suppose $\gamma_t(1) = 0.3$ and $\gamma_{t+1}(2) = 0.2$. The
pattern then considers the sequence $s_1, s_2$ as possible, having a positive probability of
0.06.

8.  CONCLUSION  AND  FUTURE  RESEARCH 

We have demonstrated a technique for performing financial pattern detection in
the presence of hidden financial data. Our technique induces a workflow, depicted in
Fig. 4, for developing the components of the architecture of Fig. 3.

Figure 4. Workflow for developing the pattern matching components of Fig. 3.

We are planning to build a special StateRover code-generator that generates
weighted/probabilistic implementation code for statechart patterns.

² More accurately, PS(Con) is an extended state vector that includes the state variable and the states of
all local variables, such as the timer state and the bFlag flag.